Older adults’ lower-limb muscle power production throughout a full flight of stairs: Reliability and comparison between different stair models

Lower-limb muscle power should be closely monitored to prevent age-related functional ability declines. Stair-climbing (SC) power is a functionally relevant measurement of lower-limb muscle power. Body-fixed sensors can measure power production throughout the different steps of a flight of stairs to assess different aspects of performance. This study investigated: 1) power production throughout a full flight of stairs; 2) whether staircases with fewer or more steps can provide similar information; and 3) test-retest reliability of SC power. A total of 116 community-dwelling older adults (57 women) ascended three staircases as fast as possible: 12, 6 and 3 steps. Mean vertical power production per step was collected and analyzed using a commercial body-fixed sensor and software. Three phases were found in SC power production: 1) an acceleration phase, i.e., the power produced in step 1 (P1); 2) a phase where the highest performance (Pmax) is reached; and 3) a fatiguing phase with power loss (Ploss; only measurable on the 12-step staircase). Mean power (Pmean) over the different steps was also evaluated. P1 did not differ between staircases (all p > 0.05), whereas Pmax and Pmean were higher with increasing number of steps (p = 0.073 to p < 0.001). P1, Pmax and Pmean were strongly correlated between staircases (r = 0.71–0.95, p < 0.05) and showed good to excellent reliability (ICC = 0.66–0.95, p < 0.05). Ploss showed poor reliability. To conclude, measurements of SC power production (P1, Pmax and Pmean) with a single sensor on the lower back are reliable across different staircases. A small, transportable, 3-step staircase can be used for measuring power production in clinical practices with no access to regular staircases. However, absolute values are dependent on the number of steps, indicating that measurements to track performance changes over time should always be done using an identical stair model.


Introduction
Aging is associated with declines in lower-limb muscle strength and power [1][2][3]. Muscle power (i.e., force × velocity) starts to decline earlier in life and at a faster rate compared to muscle mass and strength [4][5][6]. Low values of muscle power in old age have been associated with mobility limitations [7,8] and mortality [9,10], underscoring the importance of including muscle power assessments to prevent these negative outcomes.
Gold-standard methodologies for measuring lower-limb muscle power (e.g. a leg press device) are quite complex, which limits applicability in daily life [5,11]. Clinically feasible measurements of lower-limb muscle power are needed for implementing muscle power assessments in daily-life practice [12]. Lower-limb muscle power can be measured during activities of daily living, like standing up from a chair [13] or stair climbing (SC) [14,15]. The ability to climb stairs is essential for independent living in community environments [16,17]. Furthermore, SC can be considered a safe and feasible test in community-dwelling older adults [18]. Bean and colleagues (2007) used the so-called 'SC power test' or SCPT, which is an estimation of mean SC power based on the time needed to ascend the stairs (measured with a stopwatch), the vertical height of the stair and the subject's body mass. The SCPT on a 10-step staircase was compared with power production on a leg press device, showing a correlation of 0.52 [14]. However, large staircases are not often available in clinical settings. For this reason, Ni and colleagues (2017) developed a 4-step stair model. Results showed excellent reliability (ICC = 0.95) and strong criterion validity (r = 0.85-0.86) of the 4-step SCPT [19] compared to a leg press device in a sample of 50 community-dwelling older men and women.
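As an illustration of the stopwatch-based estimate described above, the following minimal sketch computes mean power as body mass × gravitational acceleration × vertical height / ascent time, which is the usual form of this estimate. The function name and the example values (70 kg, 10 steps of 0.18 m, 5.0 s) are hypothetical and are not taken from the cited studies.

```python
G = 9.81  # gravitational acceleration in m/s^2


def scpt_mean_power(body_mass_kg, stair_height_m, ascent_time_s):
    """Estimate mean stair-climbing power (W) as m * g * h / t."""
    if ascent_time_s <= 0:
        raise ValueError("ascent_time_s must be positive")
    return body_mass_kg * G * stair_height_m / ascent_time_s


# Hypothetical example: 70 kg person, 10 steps of 0.18 m, ascent in 5.0 s.
body_mass = 70.0
total_height = 10 * 0.18
power_w = scpt_mean_power(body_mass, total_height, 5.0)

# Relative power (W/kg) makes people of different body mass comparable.
relative_power = power_w / body_mass
print(f"{power_w:.1f} W, {relative_power:.2f} W/kg")
```

Because only the total time enters the calculation, this estimate yields a single value for the whole flight and cannot show how power changes from step to step.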
Although the SCPT is easy to use and enables frequent muscle power measurements, the mathematical equation only estimates a mean power value across the full flight of stairs [14]. In reality, power production can differ from step to step in stair climbing. The lack of kinematic data does not allow measuring power fluctuations throughout a flight of stairs, which can give valuable information on different aspects of performance, e.g. initial acceleration, maximum performance, fatigue-related drop in performance.
Kinematic data can be collected by means of body-fixed sensors, providing data on power production during every step of a stair [15,20]. We have previously demonstrated that sensor-based SC power is highly related (r = 0.80) to leg-extensor power in a group of young, middle-aged and older adults [15]. To the authors' knowledge, no research has investigated power production throughout the different steps on a regular staircase of 10-15 steps. It is also unknown whether staircase models with a smaller number of steps can provide similar information as a regular staircase. This is important to verify, considering that many clinical settings have limited space or no access to a full flight of stairs. Therefore, the current study aimed at: 1) investigating power production per step throughout a full flight of stairs by using a body-fixed sensor and defining parameters of interest; 2) comparing SC performance on different stair models in community-dwelling older adults; and 3) evaluating the test-retest reliability of power parameters during SC.

Subjects and study design
Community-dwelling men and women aged 65 years and older were recruited via local advertisements to participate in a cross-sectional study on stair-climbing performance. The following exclusion criteria were applied: unable to climb stairs, unstable cardiovascular disease, dementia, recent surgery, infection and musculoskeletal injury. Our aim was to recruit a minimum of N = 100 (50 per sex) subjects. In total, 116 subjects agreed to participate in the study. Sensitivity analyses in G*Power indicated that a sample of N = 116 is able to detect a small effect size (f = 0.11) in a repeated measures ANOVA (power = 0.95; α = 0.05; 2 groups (sex); 3 measurements (3-6-12 step stair)). To assess the test-retest reliability of stair-climbing parameters, 31 men and 22 women participated in a retest within 2 to 6 days after the initial test session. This sample size for test-retest reliability was based on Hopkins et al. [21], indicating that at least 50 subjects are recommended. Both sessions occurred in the same lab and under guidance of the same researcher. The study was approved by the Ethics Committee Research UZ/KU Leuven in accordance with the Declaration of Helsinki (S62540). All subjects provided written informed consent. Measurements were performed from January 2020 until March 2022.

Anthropometric and demographic measurements
Body weight (kg) and height (m) were determined with a basic scale device and a standard stadiometer. Body mass index (BMI; kg/m²) was calculated with the following formula: weight / height². Education level and presence of chronic conditions were collected by means of a questionnaire. Functional status was determined with the Short Physical Performance Battery (SPPB), as described elsewhere [22].

Stair-climbing test
Subjects were instructed to ascend three different stair models as fast as possible without using the handrail and without skipping a step: a reference staircase of 12 steps, a shorter staircase of 6 steps and a newly developed 3-step stair model (Fig 1). To ensure maximal performance, subjects were asked not to stop abruptly at the top of the stairs but to take two extra steps before standing still. Step dimensions were similar across the different stairs (18 cm height). After a familiarization trial, subjects performed three trials (30-60 s rest between trials) of each stair model and the order of the stair models was randomized. Two to five minutes of rest were provided in between conditions. During the test, subjects wore an elastic belt around their waist that positioned a sensor (DynaPort MoveTest, McRoberts, The Hague, NL) on the middle of the lower back. The sensor included a tri-axial accelerometer and gyroscope and was positioned so that it was close to the body's center of mass. The sensor had a sampling rate of 100 Hz and commercially available software (DynaPort MoveTest, McRoberts, The Hague, NL) was used to analyze the data. Every single step during stair ascent was analyzed. A step was divided in two subphases based on the vertical displacement: a rise phase, where there is vertical displacement, and a support or stance phase, where there is no vertical displacement. A more detailed description of this methodology can be found elsewhere [15]. Total ascent duration (s) was calculated as the sum of the durations for all steps of each stair model. Instantaneous power (W) was calculated as body mass × (vertical acceleration + 9.81 m/s²) × vertical velocity × cos(angle between the vertical velocity and vertical force vector). Mean power values (P, in W) were calculated for each single rise phase according to previous procedures [15]. All power values were divided by the individual's body mass to obtain relative power values (W/kg). Only the best trial of each stair model, based on lowest total duration, was included in the analyses.
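The power calculation described above can be sketched as follows. This is only an illustration of the stated formula, not the commercial software's implementation: it assumes that vertical acceleration and vertical velocity samples for one rise phase are already available, and the function names and sample values are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def instantaneous_power(body_mass_kg, vert_acc, vert_vel, angle_rad=0.0):
    """Power (W) = mass * (vertical acceleration + g) * vertical velocity * cos(angle)."""
    return body_mass_kg * (vert_acc + G) * vert_vel * math.cos(angle_rad)


def mean_rise_power(body_mass_kg, acc_samples, vel_samples, angle_rad=0.0):
    """Mean power (W) over the samples of a single rise phase."""
    powers = [
        instantaneous_power(body_mass_kg, a, v, angle_rad)
        for a, v in zip(acc_samples, vel_samples)
    ]
    return sum(powers) / len(powers)


# Hypothetical rise phase: three samples of acceleration (m/s^2) and velocity (m/s).
body_mass = 70.0
acc = [0.0, 0.5, 0.0]
vel = [0.4, 0.6, 0.4]

mean_power = mean_rise_power(body_mass, acc, vel)  # absolute power in W
relative_mean_power = mean_power / body_mass       # relative power in W/kg
print(f"{mean_power:.2f} W, {relative_mean_power:.2f} W/kg")
```

With the angle set to zero, the force and velocity vectors are assumed to be aligned, so the cosine term equals 1.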
Of note, we first explored the data on mean power production throughout the different steps of a flight of stairs (see results section). Based on this exploration, parameters of interest were defined for the remaining statistical analyses. These parameters are described in the results section.

Statistical analysis
Statistical analyses were performed with SPSS (version 28.0.1.1) and Rstudio (version 1.4.1564). Level of significance was set at p < 0.05. To examine stair climbing power production throughout a flight of stairs, relative mean power values were plotted for every single step. The 12-step staircase was identified as the reference stair, given that staircases in most houses have about 12 steps to move from one floor to another. Standard descriptive statistics (means ± SDs) were used to describe the data.
To examine differences in power and duration parameters between the three stair models (3-6-12 steps) and between sexes, linear mixed models were built using the function lmer provided by the R-package lme4. Stair model and sex were implemented as fixed effects in the regression models (model 1). If there was a main effect of stair model and sex, a second model was built by including the interaction term 'stair model-by-sex' as fixed effect in the regression model (model 2). Subject was included in the models as random effect to correct for the repeated measures design.
To examine the relationship between the three different stairs, power and duration parameters were compared with Pearson's correlation coefficients (Pearson's r) (separately for men and women). Pearson's r can be interpreted as <0.39 weak, 0.40-0.69 moderate, 0.70-0.89 strong and 0.90-1.00 very strong correlation [23].
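The correlation and its interpretation bands can be sketched in a few lines. This is an illustrative implementation of the standard Pearson formula, not the code used in the study; the function names are hypothetical, and values between 0.39 and 0.40, which the bands above leave undefined, are treated as weak here by assumption.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def interpret_r(r):
    """Map |r| onto the bands given in the text (assumption: below 0.40 counts as weak)."""
    size = abs(r)
    if size >= 0.90:
        return "very strong"
    if size >= 0.70:
        return "strong"
    if size >= 0.40:
        return "moderate"
    return "weak"


# Hypothetical Pmax values (W/kg) of five subjects on two stair models.
pmax_3_step = [3.1, 4.0, 4.4, 5.2, 5.9]
pmax_12_step = [3.6, 4.3, 5.1, 5.6, 6.8]
r = pearson_r(pmax_3_step, pmax_12_step)
print(f"r = {r:.2f} ({interpret_r(r)})")
```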
For the reliability analyses of the subsample, mean differences and SDs were calculated. Relative reliability was determined using intraclass correlation coefficients (ICC(3,1)) and their 95% confidence intervals (95% CI). ICCs were interpreted as <0.50 poor, 0.50-0.75 moderate, 0.75-0.90 good and >0.90 excellent reliability [24]. Absolute reliability was determined using the coefficient of variation (CV) and the minimal detectable change (MDC), calculated via the following formulas: CV (%) = 100 × [2 × (SDdifference / √2) / (Mean1 + Mean2)] and MDC = SEM × 1.96 × √2 (with SEM calculated as the square root of the residual mean square error from the repeated measures analysis of variance). MDC values were expressed as a percentage of the mean (MDC%) to make easy comparisons possible.
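The reliability indices above can be sketched for a simple test-retest design with two sessions. This is an illustrative implementation, not the SPSS/R code used in the study: ICC(3,1) follows the standard two-way mixed, single-measure formula (MS_subjects - MS_error) / (MS_subjects + (k - 1) × MS_error), MDC% is expressed relative to the grand mean of both sessions (an assumption), and the function name and data are hypothetical.

```python
import math
from statistics import mean, stdev


def reliability(test, retest):
    """ICC(3,1), SEM, MDC, MDC% and CV% for paired test-retest scores."""
    n, k = len(test), 2
    m1, m2 = mean(test), mean(retest)
    grand = (m1 + m2) / 2

    # Two-way ANOVA decomposition: subjects, sessions and residual error.
    subject_means = [(a + b) / 2 for a, b in zip(test, retest)]
    ss_subjects = k * sum((s - grand) ** 2 for s in subject_means)
    ss_sessions = n * ((m1 - grand) ** 2 + (m2 - grand) ** 2)
    ss_total = sum((v - grand) ** 2 for v in list(test) + list(retest))
    ss_error = ss_total - ss_subjects - ss_sessions

    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    icc = (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)
    sem = math.sqrt(ms_error)             # square root of residual mean square
    mdc = sem * 1.96 * math.sqrt(2)
    sd_diff = stdev(b - a for a, b in zip(test, retest))
    cv = 100 * (2 * (sd_diff / math.sqrt(2)) / (m1 + m2))
    return {"icc": icc, "sem": sem, "mdc": mdc,
            "mdc_pct": 100 * mdc / grand, "cv_pct": cv}


# Hypothetical relative power values (W/kg) of four subjects in two sessions.
session_1 = [4.0, 5.0, 6.0, 7.0]
session_2 = [4.2, 4.9, 6.3, 6.8]
result = reliability(session_1, session_2)
print({name: round(value, 3) for name, value in result.items()})
```

With two sessions, the residual mean square equals half the variance of the test-retest differences, so the SEM in this sketch coincides with SDdifference / √2 from the CV formula.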

Results
Descriptives of the study sample can be found in Table 1. The final two steps (6- and 12-step model) of a stair test should not be used as an indicator of performance.

Description of power production throughout a flight of stairs
The reference stair of 12 steps enables us to distinguish between three phases in SC performance: 1) an acceleration phase, where power is built up after the start of the movement in step 1 (P1); 2) a phase where the highest performance (Pmax) is reached; and 3) a fatiguing phase, where there is a loss in power (Ploss), here calculated as the percentage difference between the relative mean power on steps 4-5 and the relative mean power on steps 9-10. Of note, for the calculation of Ploss two steps were averaged so that left-right differences would not influence the calculation of the parameter. In the 3- and 6-step stair model, only P1 and Pmax can be estimated.
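The parameters defined above can be derived from a list of per-step relative mean power values, as in the sketch below. The function name and the power values are hypothetical. The text does not state the denominator or sign convention of the percentage difference, so expressing the loss relative to the mean of steps 4-5 (positive values meaning a drop in power) is an assumption.

```python
def sc_parameters(step_power):
    """Derive P1, Pmax, Pmean and (12-step stair only) Ploss from per-step power (W/kg)."""
    params = {
        "P1": step_power[0],                        # power in the first step
        "Pmax": max(step_power),                    # highest per-step power
        "Pmean": sum(step_power) / len(step_power)  # mean over all steps
    }
    # Ploss is only defined on the 12-step stair.
    if len(step_power) >= 12:
        early = (step_power[3] + step_power[4]) / 2  # mean of steps 4-5
        late = (step_power[8] + step_power[9]) / 2   # mean of steps 9-10
        # Assumption: loss expressed as a percentage of the steps 4-5 mean.
        params["Ploss"] = 100 * (early - late) / early
    return params


# Hypothetical relative mean power per step (W/kg) on a 12-step stair.
power_12 = [3.0, 4.5, 5.0, 5.2, 5.0, 4.9, 4.8, 4.7, 4.6, 4.4, 4.0, 3.5]
params_12 = sc_parameters(power_12)
print(params_12)

# On a 3-step stair no Ploss is returned.
params_3 = sc_parameters([3.0, 4.0, 3.5])
print(params_3)
```

Averaging two consecutive steps for each end of the comparison mirrors the rationale given in the text: it prevents a left-right asymmetry from being mistaken for a loss in power.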

Comparison between different stair models
Descriptives of P1, Pmax and Ploss can be found in Table 2 for men and women separately. We additionally report the overall mean power over the different steps (Pmean), as this is the only parameter that can be estimated based on total duration (as done in previous work with estimation formulas [14]).
Women demonstrated lower values for P1, Pmax and Pmean and higher values for total duration (all p < 0.001). No sex difference was found for Ploss (p = 0.289). P1 did not differ between the different stair models (all p > 0.05), whereas Pmax, Pmean and total duration were higher with increasing number of steps (all p < 0.001, except for the difference in Pmax and Pmean between 6 vs 12 step (p = 0.073 and p = 0.054)) (Table 2). A significant stair model-by-sex interaction effect was found for Pmax, indicating a smaller difference between 3 vs 6 step (p = 0.003) and 3 vs 12 step (p < 0.001) in women than in men. Likewise, the difference in Pmean between 3 vs 12 step (p < 0.001) and 6 vs 12 step (p = 0.019) was smaller in women than in men. Pearson's r for power variables varied between 0.71 and 0.83 for P1 and between 0.82 and 0.95 for Pmax and Pmean, indicating strong to very strong correlations between the different stair models (3-6-12 step) (Table 3).
Within the different stair models, P1 and Pmax showed moderate to strong relationships (r between 0.64 and 0.78, all p < 0.001). In the 12-step stair model, Ploss was significantly related to P1 in both sexes (r = -0.27 in women and r = -0.45 in men; p < 0.05) and to Pmax in men only (r = -0.45, p < 0.001).

Test-retest reliability
Mean differences for SC parameters between test and retest session are displayed in Table 4. P1, Pmax and Pmean showed good to excellent relative reliability in all staircases, with ICCs varying between 0.79 and 0.95 (only P1 of the 12-step stair showed moderate relative reliability with an ICC of 0.66). Pmean and Pmax showed good absolute reliability, with CVs ranging between 6.63% and 9.85% and MDCs ranging between 2.30% and 3.53%. P1 showed high CVs (12.61%, 14.51% and 17.72% for the 3-, 6- and 12-step stair model, respectively) but low MDCs (ranging between 5.33% and 7.83%). Ploss showed poor reliability (both absolute and relative, with ICC = 0.29, CV = -106.29% and MDC% = 29.01%). Total duration showed good reliability in the 6- and 12-step stair (CV = 5.38% and MDC% = 8.20% for the 6-step stair and CV = 4.49% and MDC% = 3.23% for the 12-step stair), but not in the 3-step stair model (CV = 12.60% and MDC% = 38.89%).

Discussion
This study investigated SC power on different stair models using a body-fixed sensor and showed that: 1) SC performance can be evaluated by means of three different parameters: the initial power on the first step or P1, the maximal power reached or Pmax, and the maintenance of power near the end of the stairs or Ploss (only measurable on a regular flight of stairs); 2) P1 and Pmax showed moderate to strong relationships within a 3-, 6- and 12-step staircase, while Ploss was weakly related to P1 and Pmax; 3) P1 and Pmax showed good reliability, whereas Ploss showed poor reliability in our study sample of community-dwelling older adults.
Measuring lower-limb muscle power often involves isolated joint movements, which do not capture the complexity of daily-life movements [11]. For implementing lower-limb muscle power measurements in daily-life practice, tests should be functional, easy and low-cost [13]. Timed stair tests are frequently used in clinical practice [25]. Based on timed SC, a mean power value can be estimated [14,19]. However, SC is in reality a complex and dynamic activity of daily living which requires strength, coordination and balance [17]. A lot of information is missing when estimating just one mean power value. When looking into the kinematics of SC, ascending a step involves two phases [15]: 1) a rise phase, where there is vertical velocity of the body; and 2) a stance phase, where there is no vertical velocity. Power is only produced in the rise phase of the step.
Accelerometers or body-fixed sensors allow us to analyze the SC movement in detail, as they can provide information about every single phase of every single step [15]. SC was previously measured with body-fixed sensors [15,26]. However, no study has investigated the evolution of SC power production per step. By plotting the mean power production from the first step to the last, we were able to differentiate between different phases in the SC performance. First, there is an acceleration phase or a phase where power is building up from zero in the first step, here defined as P1. Second, there is a phase where the maximal performance is reached or Pmax. And third, in a regular staircase with a sufficient number of steps, we see a fatigue-related drop in power or Ploss. This subdivision in three phases is in line with neuromuscular function assessments in isometric or dynamic tests, i.e. 1) rate of force/power development in the initial phase [27,28], 2) peak force/power production in the phase of maximal performance [28] and 3) a fatigue-related reduction in force/power [29].
As participants may change their movement pattern or speed between separate test sessions, test-retest reliability of SC performance parameters should be evaluated [15]. We have previously reported excellent test-retest results for duration and Pmax on a 6-step stair model (ICC = 0.93-0.94 and CV = 4.0-6.3%) [15]. These results are in line with the current study, where we found good to excellent reliability for total duration, Pmean and Pmax on the 6- and 12-step stair model. Even though Pmean and Pmax also showed good reliability in the 3-step stair model, total duration was less reliable. This indicates that caution is advised when total duration on a short flight of stairs is used to estimate mean power values. In addition, P1 appeared less reliable than Pmax or Pmean, which is in line with previous reports on rate of force/power development, indicating that early phases of force/power production are more prone to variability [28,30]. Furthermore, Ploss showed poor reliability (ICC = 0.29, CV = -106.29%, MDC% = 29.01%) in our study sample and should not be used.
As most clinical practices do not have access to a full flight of stairs, the possibility to measure SC parameters on a smaller, and even transportable, staircase is valuable. Results showed that P1 did not differ between a 3-, 6- and 12-step stair model. This indicates that subjects started the SC test in a similar way on the different staircases, regardless of the number of steps. Absolute values of Pmax were higher with increasing number of steps (Pmax 3 < 6 < 12 step stair), indicating that 3 to 6 steps are insufficient to reach the real Pmax. However, a high correlation was found between the Pmax values of the three different staircases (Pearson's r ranged between 0.82 and 0.93), indicating that the values on a smaller staircase are a good estimate of the Pmax reached on the 12-step stair. Similar MDC% values were found for the different parameters between the different staircases, supporting the idea that staircase models with a smaller number of steps can provide similar information as a regular staircase. As absolute values do not match between the different stair models, follow-up measurements of an individual should always be performed on the same stair model.
This study has some limitations. We have included a small sample of rather well-functioning older adults. As power declines from middle age onwards [5], the role of P1, Pmax and Ploss should be examined further in a more diverse sample, including middle-aged adults and mobility-limited older adults. Assessing SC can be challenging in mobility-limited older adults and the risks of the test need to be balanced against the benefits. However, we propose the use of SC tests to detect early changes in muscle power prior to the onset of mobility limitations, as it appears more sensitive to age-related declines than sit-to-stand tests [15].

Conclusions
Three different performance phases can be distinguished during SC: an acceleration phase where power is built up, a phase of maximal performance and a phase of power loss. Measurements of SC power production with a single sensor on the lower back are found to be reliable across different staircase models. A small, transportable, 3-step staircase can be used for measuring power production in clinical practices with no access to regular staircases. However, absolute values are clearly lower in a 3-step stair model compared to a regular staircase, indicating that measurements to track performance changes over time should always be done using the same stair model. Future research should investigate the predictive value of SC power production on loss of independence and negative health outcomes with aging. In the meantime, a better insight into the trainability of the different components of power during SC (P1, Pmax and Ploss) can clarify the potential of SC in exercise programs.

Fig 2 gives an overview of stair-climbing performance, i.e. relative mean power per step, on the three different stairs. The drop in performance in the final two steps of the 6-step model (i.e., steps 5 and 6) compared to steps 5 and 6 of the 12-step model (both p < 0.001) indicates that the performance is negatively influenced by a braking phase near the end of the stair (and not by fatigue). The subjects seem to anticipate the end of the stair, resulting in a reduction in

Fig 2. Stair-climbing (SC) power per step on a 3-, 6- and 12-step stair model. Dots and bars represent means and SEMs, respectively. The three different phases of SC (i.e., P1, Pmax and Ploss) are indicated for the 12-step stair. P1 = power in step 1; Pmax = highest power production; Ploss = loss in power. https://doi.org/10.1371/journal.pone.0296074.g002

Table 4. Mean differences of stair-climbing performance parameters in the test-retest study sample (n = 53) and reliability coefficients.
Data of 3 subjects in the retest were missing because of issues in the manufacturer's software.